We study the benefits of quantum strategies in multi-player evolutionary games. We base
our study on the three-player Prisoner's Dilemma (PD) game. In order to model the simultaneous
interaction between three agents we use hypergraphs and hypergraph networks. In particular, we
study two types of networks: a random network and a scale-free-like (SF-like) network. The
obtained results show that in the case of a three-player game on a hypergraph network, quantum
strategies are not necessarily Evolutionarily Stable Strategies. In some cases, the defection
strategy can be as good as a quantum one.

I. INTRODUCTION

Game theory is a branch of mathematics broadly applied
in a great number of fields, from biology to social
sciences and economics. A great deal of effort has gone
into the study of evolutionary games on graphs, which
was initiated by the work of Nowak and May [1]. Since
their work was published, the problem has been studied
extensively [2].

Quantum game theory [3] allows the agents to use
quantum strategies. The set of quantum strategies is
much larger than the classical one; hence it offers the
possibility of much more diverse behavior of agents in the
network. It has been shown that if only one player is
aware of the quantum nature of the system, he/she will
never lose in some types of games [4]. Recently, it has
been demonstrated that a player can cheat by appending
additional qubits to the quantum system [5].

Combining evolutionary games and quantum game
theory has produced interesting results [6]. In some
cases the quantum strategies can dominate the entire
network, infecting it effectively. In our work we would like
to focus on introducing additional strategies which the
agents can use, since in the multi-player case there exists
a Pareto Optimal Nash Equilibrium for the Prisoner's
Dilemma game [7]. Moreover, the PD game is interesting
to study because it has been realized experimentally [8].

This paper is organized as follows: Section II describes
the types of 3-hypergraph networks used in simulations.
Section III introduces the three-player Prisoner's
Dilemma game. In Section IV the simulation setup is
described. Section V contains results obtained from
computer simulations and their discussion. Finally, in
Section VI the final conclusions are drawn.

II. HYPERGRAPHS AND HYPERGRAPH NETWORKS

We assume a hypergraph [5] network $H(X, E)$, where
$X$ is a set of nodes and $E$ is a set of non-empty subsets
of $X$, $E \subseteq 2^X$. Elements of $E$ are the hyperedges
of $H$. We restrict ourselves to the case when every
$A \in E$ satisfies $|A| = 3$, i.e. every edge of
the hypergraph connects exactly three nodes. Hereafter
we will refer to this structure as a 3-hypergraph. We set
$N = |X|$, the total number of agents.

We construct two types of networks: a random network,
in which all hyperedges connect random nodes, and
a SF-like [10] network. We set the number of hyperedges
in the random case to $|E| = 10000$. The SF-like network
is constructed in the following way. First, a network of
$m_0 \ll N$ all-connected nodes is created. Then a new
node with $m < m_0$ links is added to the network. For
each of the $m$ links, a pair of unique nodes is chosen from
the existing network and a new hyperedge, joining the
pair and the new node, is added. The probability of a
node $i$ being chosen is given by:

$$p_i = \frac{k_i}{\sum_j k_j}, \tag{1}$$

where $k_i$ is the degree of node $i$. This procedure is
repeated until the number of nodes of the network reaches
$N$.
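
The two constructions above can be sketched in Python as follows. This is a minimal sketch, not the code used for the simulations. The function names are hypothetical. Three readings are assumptions: the seed of $m_0$ all-connected nodes is taken to mean that every triple of seed nodes forms a hyperedge, the degree of a node is the number of hyperedges containing it, and the degrees used for attaching a new node are those of the network before that node is added. The parameter values in the usage lines are the ones quoted in the simulation setup.

```python
import random
from collections import Counter
from itertools import combinations


def random_3_hypergraph(n_nodes, n_edges, rng):
    """Random network: every hyperedge joins three distinct, uniformly chosen nodes."""
    edges = set()
    while len(edges) < n_edges:
        edges.add(frozenset(rng.sample(range(n_nodes), 3)))
    return edges


def node_degrees(edges):
    """Degree k_i: the number of hyperedges that contain node i."""
    degree = Counter()
    for edge in edges:
        degree.update(edge)
    return degree


def sf_like_3_hypergraph(n_nodes, m0, m, rng):
    """Grow a 3-hypergraph by preferential attachment, p_i = k_i / sum_j k_j."""
    if m0 < 3 or m > m0 * (m0 - 1) // 2:
        raise ValueError("need m0 >= 3 and at most C(m0, 2) links per new node")
    # Assumption: the all-connected seed contains every triple of the m0 seed nodes.
    edges = {frozenset(triple) for triple in combinations(range(m0), 3)}
    degree = node_degrees(edges)
    for new_node in range(m0, n_nodes):
        existing = list(range(new_node))
        weights = [degree[v] for v in existing]  # proportional to k_i, Eq. (1)
        new_edges = set()
        while len(new_edges) < m:
            a, b = rng.choices(existing, weights=weights, k=2)
            if a != b:  # the pair must consist of two different nodes
                new_edges.add(frozenset((a, b, new_node)))
        edges |= new_edges
        degree.update(v for edge in new_edges for v in edge)
    return edges


rng = random.Random(0)
random_net = random_3_hypergraph(n_nodes=2500, n_edges=10000, rng=rng)
sf_net = sf_like_3_hypergraph(n_nodes=2500, m0=3, m=2, rng=rng)
```

Storing hyperedges as a set of `frozenset` objects rules out duplicate hyperedges and keeps the 3-uniformity easy to check.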



III. THREE-PLAYER PD GAME 

The classical Prisoner's Dilemma game is as follows:
two players can either cooperate (C) or defect (D). When
they both cooperate, each receives a payoff of 3. On the
other hand, when they both defect, each receives a payoff
of 1. When only one of them defects, he/she receives a
payoff of 5, while the cooperating player gets 0.
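
The dilemma hidden in these payoffs can be checked by brute force. The short Python check below is only an illustration with hypothetical helper names. It enumerates the four strategy profiles using the payoffs quoted above and tests which of them are Nash equilibria.

```python
# Payoffs (first player, second player) as quoted in the text.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
STRATEGIES = ("C", "D")


def is_nash(profile):
    """True if neither player can gain by changing only his/her own move."""
    a, b = profile
    first_ok = PAYOFF[(a, b)][0] >= max(PAYOFF[(s, b)][0] for s in STRATEGIES)
    second_ok = PAYOFF[(a, b)][1] >= max(PAYOFF[(a, s)][1] for s in STRATEGIES)
    return first_ok and second_ok


nash = [profile for profile in PAYOFF if is_nash(profile)]
print(nash)  # [('D', 'D')]

# Both players would nevertheless be better off under mutual cooperation.
both_prefer_cc = all(PAYOFF[("C", "C")][i] > PAYOFF[("D", "D")][i] for i in (0, 1))
print(both_prefer_cc)  # True
```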

This approach can be extended to a greater number
of players. In the three-player case, the payoff matrix is
shown in Table I. We can see that every player is better
off defecting than cooperating no matter what the
other players do. In terms of game theory, (D, D, D)
is the unique Nash equilibrium of the game. If any
one player deviates from this strategy, he/she will receive
a lower payoff. On the other hand, we can see that the
strategy profile (C, C, C) yields a higher payoff than
(D, D, D). In terms of game theory this profile is Pareto
Optimal. In our case the players are rational and the
game will end in (D, D, D), not (C, C, C); hence the
dilemma.

TABLE I: The payoff matrix of the three-player PD
game (after [11]). The first entry is the payoff of Alice,
the second denotes the payoff of Bob and the third
represents the payoff of Charlie.

In the quantum case the setup is as follows. Each
player is sent a qubit and can locally operate on it, using
any unitary operator $U \in SU(2)$. The initial state of the
system is entangled:

$$|\psi\rangle = J|000\rangle, \tag{2}$$

where $J$ is the entangling operator [12]:

$$J = \frac{1}{\sqrt{2}}\left(\mathbb{1}^{\otimes N} + i\sigma_x^{\otimes N}\right), \tag{3}$$

with $N = 3$ denoting here the number of players.
The quantum circuit for the game is shown in Figure 1.

FIG. 1: Quantum circuit for the three-player PD game.
$U_A$, $U_B$, $U_C$ are the strategies of Alice, Bob and Charlie
respectively.

After the players have applied their respective strategies,
the untangling gate, $J^\dagger$, is applied to the system, hence
the final state of the game is

$$|\psi_f\rangle = J^\dagger (U_A \otimes U_B \otimes U_C) J|000\rangle, \tag{4}$$

where $U_A$, $U_B$, $U_C$ are the players' strategies. The payoff
of the first player (Alice) amounts to:

$$\$_A = \sum_{i,j,k \in \{0,1\}} p_{ijk} \left|\langle ijk|\psi_f\rangle\right|^2, \tag{5}$$

where $p_{ijk}$ are numbers corresponding to the possible
classical payoffs of Alice, defined in Table I.



IV. SIMULATIONS

We assume an initial population of 2500 agents, located
at the nodes of the hypergraph. The SF-like network is
constructed with initial size $m_0 = 3$, and the number of
links of each new node is $m = 2$. The set of allowed
strategies is as follows [6]:

$$S = \{C, D, H, Q\}, \tag{6}$$

where the unitary operators corresponding to each of the
strategies take the form:



C = 
H = 

In the two-player case the strategy profile (Q, Q) is a Nash
Equilibrium of the system. These strategies are randomly
assigned to agents in the network in such a way
that the initial fractions of strategies C, D, H, Q are
49%, 49%, 1%, 1%, respectively.
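
The payoff defined by Eqs. (4) and (5) can be evaluated numerically for any strategy profile. The pure-Python sketch below is an illustration under stated assumptions, not the authors' code. The matrices used for C, D, H and Q are the forms conventionally used in the quantum-game literature (identity, bit flip, Hadamard and diag(i, -i)) and are assumed here. The table returned by the hypothetical function `alice_payoffs` holds placeholder values with the temptation T as a parameter; they are not the entries of Table I.

```python
from functools import reduce
from math import sqrt


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


S = 1 / sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]  # Pauli sigma_x (bit flip)

# Entangling gate of Eq. (3) for three qubits: (1 + i sigma_x^{(x)3}) / sqrt(2).
III = reduce(kron, [I2] * 3)
XXX = reduce(kron, [X] * 3)
J = [[S * (a + 1j * b) for a, b in zip(ra, rb)] for ra, rb in zip(III, XXX)]

# ASSUMPTION: conventional matrix forms, not recovered from the text.
STRATEGY = {
    "C": I2,
    "D": X,
    "H": [[S, S], [S, -S]],
    "Q": [[1j, 0], [0, -1j]],
}


def alice_payoffs(T):
    """PLACEHOLDER payoffs p_ijk of Alice (0 = C, 1 = D; order Alice, Bob, Charlie)."""
    return {
        (0, 0, 0): 3.0, (0, 0, 1): 2.0, (0, 1, 0): 2.0, (0, 1, 1): 0.0,
        (1, 0, 0): T, (1, 0, 1): 4.0, (1, 1, 0): 4.0, (1, 1, 1): 1.0,
    }


def outcome_probabilities(a, b, c):
    """Probabilities |<ijk|psi_f>|^2 of the eight classical outcomes."""
    psi = [1] + [0] * 7  # |000>
    u = reduce(kron, [STRATEGY[a], STRATEGY[b], STRATEGY[c]])
    for gate in (J, u, dagger(J)):  # Eq. (4), applied right to left
        psi = matvec(gate, psi)
    return {((n >> 2) & 1, (n >> 1) & 1, n & 1): abs(amp) ** 2
            for n, amp in enumerate(psi)}


def alice_payoff(a, b, c, T):
    """Expected payoff of Alice, Eq. (5)."""
    table = alice_payoffs(T)
    return sum(table[ijk] * prob for ijk, prob in outcome_probabilities(a, b, c).items())


for profile in [("C", "C", "C"), ("D", "C", "C"), ("Q", "H", "D")]:
    print(profile, round(alice_payoff(*profile, T=5.0), 6))
```

Under these assumptions the classical moves commute with $J$, so the all-C profile returns the (C, C, C) entry of the table and a lone defector facing two cooperators receives exactly T.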

Next, we introduce an additional strategy E, defined
as:



I] 1 

-1 



(8) 



The strategy profile (E, E, E) is Pareto Optimal and a
Nash Equilibrium [7]. We assign the strategies C, D, H,
Q, E with frequencies 48%, 48%, 2%, 1%, 1%, respectively.

Finally, we consider a setup in which the invading
strategy is not assigned at random. Instead, we allocate
strategy Q in the four-strategy case, and strategy E in the
five-strategy case, to the node with the highest degree.
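
The two initialisation schemes can be sketched as follows. The function names are hypothetical. The rounding of the shares to whole agents and the uniform draw used for the agents that do not receive the invading strategy are assumptions, not details taken from the text.

```python
import random
from collections import Counter


def assign_by_fractions(nodes, fractions, rng):
    """Shuffle the nodes and hand out strategies so that the shares match `fractions`."""
    nodes = list(nodes)
    rng.shuffle(nodes)
    names = list(fractions)
    assignment, start = {}, 0
    for idx, name in enumerate(names):
        if idx == len(names) - 1:
            count = len(nodes) - start  # the last strategy takes the remainder
        else:
            count = round(fractions[name] * len(nodes))
        for node in nodes[start:start + count]:
            assignment[node] = name
        start += count
    return assignment


def assign_invader_to_hub(degree, invader, others, rng):
    """Give `invader` to the highest-degree node and a uniformly drawn strategy to the rest."""
    hub = max(degree, key=degree.get)
    assignment = {node: rng.choice(others) for node in degree}
    assignment[hub] = invader
    return assignment


rng = random.Random(1)
four = assign_by_fractions(range(2500), {"C": 0.49, "D": 0.49, "H": 0.01, "Q": 0.01}, rng)
print(Counter(four.values()))

toy_degree = {0: 1, 1: 5, 2: 2, 3: 2}  # toy degrees, for illustration only
hub_case = assign_invader_to_hub(toy_degree, "Q", ["C", "D", "H"], rng)
print(hub_case[1])  # Q
```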

The PD game is played by all agents on both networks.
We study the impact of the value of the parameter T
(the temptation, or moral hazard) on the final state of the
population. This parameter is defined as the payoff of the
first player when he/she defects while the other players
use the C strategy. Its interpretation is as follows.
Suppose the prisoners had a chance to discuss a strategy.
It is evident that they should agree on the Pareto Optimal
profile (C, C, C). However, if Alice decides to defect, she
receives a higher payoff. Thus this parameter measures
how much Alice is tempted to betray the other prisoners.

The game is played for 10000 generations and the results
of the last 1000 generations are stored. The strategy
frequencies averaged over these generations are used as
the final results. If a population does not change for 500
generations, the state is considered to be an equilibrium
state of the system.
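
The bookkeeping described above can be sketched as follows. The names are hypothetical, and the update rule that produces each generation is outside the scope of the sketch. Representing the state of a generation by its tuple of strategy frequencies, and testing the equilibrium criterion on the most recent generations, are assumptions made for illustration.

```python
def mean_frequencies(history, tail=1000):
    """Average the per-generation frequency tuples over the last `tail` generations."""
    last = history[-tail:]
    return tuple(sum(column) / len(last) for column in zip(*last))


def is_equilibrium(history, stable=500):
    """True if the recorded state has not changed during the last `stable` generations."""
    if len(history) < stable:
        return False
    window = history[-stable:]
    return all(state == window[0] for state in window)


# Toy history of (C, D) frequencies: 10 transient generations, then a frozen state.
history = [(0.5, 0.5)] * 10 + [(0.25, 0.75)] * 600
print(mean_frequencies(history, tail=600))   # (0.25, 0.75)
print(is_equilibrium(history, stable=500))   # True
print(is_equilibrium(history, stable=605))   # False
```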



V. RESULTS AND DISCUSSION 

In the case with four possible strategies, the results of
computer simulations are depicted in Figure 2. Figure
2a shows the results for a random network, whereas the
results for the SF-like network are shown in Figure 2b.
In the case of a random network, we see that strategy C
is the dominant one until T = 5.64, when the network
starts shifting between strategies C and D. It settles
down at T = 6, where about half the agents use strategy
C. As T increases, strategies C and D slowly lose their
significance in favour of strategy Q. For T > 8 the system
reaches another equilibrium state, where strategies D and
Q have the same frequency.



FIG. 2: Results for PD on hypergraph networks, 4
strategies, strategies assigned at random, according to
weights.

In the case of a SF-like network, the agents prefer the
D strategy, which almost never reaches zero frequency.
Again, for T > 6 the quantum strategy Q gains
significance. Although there are some oscillations of the
fractions of strategies as T increases, strategies D and Q
are again adopted by approximately the same fraction of
agents. On the basis of the presented figures and the
above discussion it may be inferred that changing the
type of the network significantly decreases the importance
of strategy C, but does not have a great impact on
strategies D and Q.

Figure 3 illustrates the results obtained for five possible
strategies. Figure 3a illustrates the results for a random
network, and Figure 3b shows the results for the SF-like
network. The examination of Figure 3a reveals that it has
the same character as Figure 2a, except that for T < 6
the dominant strategy is E, not C. At around T = 6, the
network shifts from an E-dominated one to a network with
three possible strategies: C, D, Q. As T increases,
strategies C and D lose their significance in favor of Q.
At around T = 8, there is another shift in strategies: the
fraction of strategy C decreases to zero, and strategies D
and Q are used by equal fractions of agents. On examining
the SF-like network, we again perceive behaviour similar to
the four-strategy case, but with far fewer oscillations.
Again, for T > 6 it is observed that strategies D and Q are
used by approximately the same number of agents. From
the above discussion we observe that the introduction of
strategy E into the network has two effects. Firstly, for
a random network it is the dominant strategy for low
values of the temptation T. Secondly, for a SF-like
network, it suppresses some of the oscillations of the
network.

FIG. 3: Results for PD on hypergraph networks, 5
strategies, strategies assigned at random, according to
weights.

Next we move on to the case where only one agent,
the one with the highest degree, was assigned a quantum
strategy. For the case of four available strategies, the
results are shown in Figure 4. Figure 4a shows the results
for a random network and Figure 4b shows the results for
a SF-like network. In this case the agent with the highest
degree was assigned the Q strategy, and all other strategies
were distributed among the remaining agents with equal
probabilities. For the random network, we see that for
T < 6 strategy C dominates the network. Again, at
around T = 6 there is a shift, but this time it is strategy H
that gains significance. The fraction of agents using
strategy H slowly increases with increasing T. In the case
of a SF-like network, we find that for T < 6.5 strategy C
dominates the network. For greater T the network starts
shifting from strategy C to H, while a small fraction of
agents also use the Q strategy. Summing up this case, we
can conclude that strategy Q cannot infect any of the
networks, but assigning strategy H to a relatively large
fraction of agents allows it to dominate the network for
some values of T.

FIG. 4: Results for PD on hypergraph networks, 4
strategies, strategy Q assigned to the node with highest
degree.

FIG. 5: Results for PD on hypergraph networks, 5
strategies, strategy E assigned to the node with highest
degree.

Finally, we show the results for the case with five
possible strategies. Now we assign strategy E to the agent
with the highest degree. The results obtained in this case
are shown in Figure 5. Figure 5a illustrates the results
for a random network and Figure 5b shows the results
for a SF-like network. For a random network, it can
be seen that for T < 6 the C strategy is still employed
by all of the agents. As T increases from 6 to 8,
strategies C, Q and D are each used by a significant
fraction of agents. At around T = 8 strategy H dominates
the network. Then, just before T reaches 9, there is
another sudden shift and strategies D and Q are used by
the same fraction of agents, with other strategies being
far less significant. In the case of the SF-like network, we
observe entirely different behaviour: strategy E
always dominates the network, regardless of the value of
T. From this discussion it is evident that the E strategy
can only invade a network of a specific type; a random
network is immune to invasion.



VI. CONCLUSIONS 



We investigated the evolution of strategies on hypergraph
networks when quantum strategies H, Q and E
are available to the players. Strategies Q and E are
considered to be invaders in our scenario. Our simulations
of the evolution of strategies on random and
SF-like hypergraph networks indicate that the structure
of the network is a decisive factor. In addition, we found
that strategy E, despite being Pareto Optimal
and a Nash Equilibrium for the three-player Prisoner's
Dilemma game, does not invade the entire network in all
cases. In fact, it can only invade a SF-like network,
provided that the agent with the highest degree is assigned
this strategy. In other cases, depending on the value of
the temptation T, the network is either dominated by
strategy C, which happens for T < 6, or strategies D and
Q have equal frequencies, which happens for T > 8. The
results obtained for the case with four available strategies
are slightly different. In this case strategy Q is considered
to be the invader. The results show that a random
network is invaded not by strategy Q, but by strategy
H for T > 6. On the other hand, the SF-like network
constantly shifts between C and H for T > 6.